Citation Block Determination Using Textual Coherence

Authors

  • Dain Kaplan
  • Takenobu Tokunaga
  • Simone Teufel
Abstract

Detecting the boundaries of citations in the running text of research papers is an important task for research paper summarisation, idea attribution, sentiment analysis, and other citation-based analysis research. Recently, detecting non-explicit citing sentences has garnered some attention, but this line of work is still in its infancy. We define this task as citation block determination (CBD). In this paper we propose and investigate the effects of various types of textual coherence on CBD, positing that coherence is a crucial aspect of identifying citation blocks, as it is fundamental to the composition of citations themselves. We demonstrate promising results, with our method outperforming the previous state of the art on F1 by a large margin, with an improvement in both precision and recall, and further provide an in-depth error analysis and discussion of why this is the case.


Similar resources

Co-citation analysis, bibliographic coupling, and direct citation: Which citation approach represents the research front most accurately?

In the past several years studies have started to appear comparing the accuracies of various science mapping approaches. These studies primarily compare the cluster solutions resulting from different similarity approaches, and give varying results. In this study, we compare the accuracies of cluster solutions of a large corpus of 2,153,769 recent articles from the biomedical literature (2004-20...


Diversity versus Channel Knowledge at Finite Block-Length

We study the maximal achievable rate R∗(n, ε) for a given block-length n and block error probability ε over Rayleigh block-fading channels in the noncoherent setting and in the finite block-length regime. Our results show that for a given blocklength and error probability, R∗(n, ε) is not monotonic in the channel’s coherence time, but there exists a rate maximizing coherence time that optimally tra...


Citation Matching in Sanskrit Corpora Using Local Alignment

Citation matching is the problem of finding which citation occurs in a given textual corpus. Most existing citation matching work is done on scientific literature. The goal of this paper is to present methods for performing citation matching on Sanskrit texts. Exact matching and approximate matching are the two methods for performing citation matching. The exact matching method checks for exact...


Scene Determination Using Auditive Segmentation Models of Edited Video

This chapter describes different approaches that use audio features for determination of scenes in edited video. It focuses on analysing the sound track of videos for extraction of higher-level video structure. We define a scene in a video as a temporal interval which is semantically coherent. The semantic coherence of a scene is often constructed during cinematic editing of a video. An example...


Hybrid clustering for validation and improvement of subject-classification schemes

A hybrid text/citation-based method is used to cluster journals covered by the Web of Science database in the period 2002–2006. The objective is to use this clustering to validate and, if possible, to improve existing journ...



Journal:
  • JIP

Volume 24

Publication date: 2016